# 35 Similarity, determinant and trace.

## Definition and interpretation of similar matrices.

First, recall the definition.

> When two square matrices $A$ and $B$ satisfy the relation $$ A=PBP^{-1} $$for some invertible matrix $P$, then we say $A$ is **similar** to $B$, and we write $A\stackrel{\text{sim}}\sim B$.
> When this happens, $A$ and $B$ are matrix representations of the same linear map, just represented over different basis sets.
> 
> Furthermore, if we think of $A=[T]_{std}^{std}$ as the standard matrix of some linear map $T$, and $B=[T]_{\beta}^{\beta}$ as the matrix representation of $T$ with respect to some ordered basis $\beta$, then the matrix $P$ such that $A=PBP^{-1}$ has as its columns precisely the vectors of the ordered basis $\beta$, with diagram $$ \begin{array}{rcl} \underset{(std)}{\mathbb R^{n}} & \xrightarrow{A} & \underset{(std)}{\mathbb R^{n}} \\ P^{-1}\big\downarrow &\quad ⟲ \quad & \big\uparrow P \\ \underset{(\beta)}{\mathbb R^{n}} & \xrightarrow{B} & \underset{(\beta)}{\mathbb R^{n}} \end{array} $$(careful with the arrow directions).

Let us first address the question of how to tell whether two matrices are similar.

**Example.** Let us look at the matrices $$ A = \begin{pmatrix}3 & 0 \\ 0 & -1\end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix}1 & 2\\ 2& 1\end{pmatrix}. $$These are the same two matrices that we saw a while ago in [[smc-spring-2024-math-13/linear-algebra-notes/32-change-of-basis-part-0-motivation-and-similarity|notes 32 (section tale of two matrices)]], and you may recall that they do similar things: $$ \begin{array}{ll} Ae_{1}=3e_{1} & Ae_{2} = -e_{2} \\ Bb_{1}=3b_{1} & Bb_{2} = -b_{2} \end{array} $$where $b_{1}=\begin{pmatrix}1\\1\end{pmatrix}$ and $b_{2}=\begin{pmatrix}1\\-1\end{pmatrix}$.

We will show that $A$ is indeed similar to $B$ by finding an explicit invertible matrix $P$ such that $A = PBP^{-1}$. To do this, note that the condition $A=PBP^{-1}$ is equivalent to $AP = PB$. So by setting $P=\begin{pmatrix}a & b\\c& d\end{pmatrix}$, this amounts to solving a linear system, $$ \begin{array}{c} \begin{array}{rrcl} & AP & = & PB \\ \implies & \begin{pmatrix}3 & 0\\0 & -1\end{pmatrix}\begin{pmatrix}a & b\\c & d\end{pmatrix} & = & \begin{pmatrix}a & b\\c & d\end{pmatrix} \begin{pmatrix}1 & 2\\2 & 1\end{pmatrix} \end{array} \\ \implies \left\{\begin{array}{l} 3a = a+2b \\ 3b=2a+b \\ -c = c+2d \\ -d=2c+d \end{array}\right. \end{array} $$ This system gives $2a=2b$, so $a = b$, and $-2c=2d$, so $c=-d$. Namely we have $$ \begin{align*} a & = b \\ b & = \text{free} \\ c & = -d \\ d & = \text{free} \end{align*} $$and hence $P$ has the form $$ P=\begin{pmatrix}b & b\\-d & d\end{pmatrix}. $$Now we check whether we can assign the free variables $b,d$ values that make $P$ invertible. Indeed, we can take $b=1,d=1$, giving $$ P = \begin{pmatrix}1 & 1\\-1 & 1\end{pmatrix}, $$an invertible matrix. One can see this quickly by noting $\det(P)=2\neq0$. Hence $A=PBP^{-1}$, and $A$ is similar to $B$!

But wait, there is more: the columns of $P$, namely $\begin{pmatrix}1\\-1\end{pmatrix}$ and $\begin{pmatrix}1\\1\end{pmatrix}$, are precisely the vectors in the basis set $\beta=(b_{1},b_{2})$! Except for one oddity: the order is switched. However, if we pick $d=-1$ instead, we get the same ordering as $\beta$. (Both orderings happen to work here because conjugating $B$ by the matrix that swaps the two basis vectors returns $B$ itself.) In general, the free variables give us some flexibility, within constraints, in choosing the invertible $P$, and hence the ordered basis $\beta$. $\blacklozenge$
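For readers who like to double-check by computer, here is a small SymPy sketch (illustrative only, assuming SymPy is available; not part of the required material) that redoes the computation above: it solves $AP=PB$ for the entries of $P$, then verifies one invertible choice of $P$.

```python
# Illustrative sketch: solve AP = PB symbolically, then verify A = P B P^{-1}.
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[3, 0], [0, -1]])
B = sp.Matrix([[1, 2], [2, 1]])
P = sp.Matrix([[a, b], [c, d]])

# AP = PB is a homogeneous linear system in the entries a, b, c, d.
print(sp.solve(list(A*P - P*B), [a, b, c, d], dict=True))
# expect: a = b and c = -d, with b and d free

# Pick b = 1, d = 1 as in the notes, so P0 = [[1, 1], [-1, 1]].
P0 = sp.Matrix([[1, 1], [-1, 1]])
print(P0.det())                        # 2, nonzero, so P0 is invertible
print(sp.simplify(P0*B*P0.inv() - A))  # zero matrix, confirming A = P0 B P0^{-1}
```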
**Example.** (A) Show that $A=\begin{pmatrix}1&1\\0&1\end{pmatrix}$ is similar to $B=\begin{pmatrix}1&2\\0&1\end{pmatrix}$ by finding an explicit invertible $P$ such that $A=PBP^{-1}$. (B) Find an ordered basis $\beta$ such that if $A$ is the standard matrix of some linear map $T$, then $B$ is the matrix representation $[T]_{\beta}^{\beta}$.

$\blacktriangleright$ (A) We set up $P = \begin{pmatrix}a&b\\c&d\end{pmatrix}$ and $AP=PB$, so we get $$ \begin{pmatrix}1 & 1\\0 & 1\end{pmatrix}\begin{pmatrix}a & b\\c & d\end{pmatrix}=\begin{pmatrix}a & b\\c & d\end{pmatrix}\begin{pmatrix}1 & 2\\0 & 1\end{pmatrix} $$This gives $a=\frac{1}{2}d$, $b=\text{free}$, $c=0$, $d=\text{free}$, so $$ P=\begin{pmatrix} \frac{1}{2}d & b \\0 & d\end{pmatrix}. $$Picking $d=2,b=0$ gives an invertible matrix $$ P=\begin{pmatrix}1 & 0\\0 & 2\end{pmatrix}, $$as we can quickly check by its determinant. So $A=PBP^{-1}$. (There are many valid choices of $P$.)

(B) The columns of $P$ give precisely an ordered basis $\beta$ such that if $A =[T]_{std}^{std}$ is the standard matrix of some linear map $T$, then $B=[T]_{\beta}^{\beta}$. So we have $$ \beta = \left(\begin{pmatrix}1\\0\end{pmatrix},\begin{pmatrix}0\\2\end{pmatrix}\right). \quad\blacklozenge$$

**Example.** Is $A=\begin{pmatrix}1 & 2\\1&1\end{pmatrix}$ similar to $B= \begin{pmatrix}1&1\\1&2\end{pmatrix}$?

$\blacktriangleright$ Suppose $AP=PB$ for some $2\times 2$ matrix $P = \begin{pmatrix}a&b\\c&d\end{pmatrix}$. Then we have $$ \begin{pmatrix}1 & 2\\1 & 1\end{pmatrix}\begin{pmatrix}a & b\\c & d\end{pmatrix}=\begin{pmatrix}a & b\\c & d\end{pmatrix}\begin{pmatrix}1 & 1\\1 & 2\end{pmatrix} $$This gives a system of equations $$ \begin{align*} a+2c & = a+b \\ b+2d & = a+2b \\ a+c & = c+d \\ b+d & = c+2d \end{align*} $$which has only the trivial solution $a=b=c=d=0$, so $P = \begin{pmatrix}0&0\\0&0\end{pmatrix}$. So there is no invertible $P$ such that $AP=PB$. Whence $A$ is **not similar** to $B$! $\blacklozenge$

## Similarity is an equivalence relation, and remarks on canonical forms.

Similarity of $n\times n$ square matrices is an **equivalence relation**, that is to say:

> For $n\times n$ matrices $A,B,C$:
> (1) $A \stackrel{\text{sim}}\sim A$ (reflexive)
> (2) If $A \stackrel{\text{sim}}\sim B$ then $B \stackrel{\text{sim}}\sim A$ (symmetric)
> (3) If $A \stackrel{\text{sim}}\sim B$ and $B \stackrel{\text{sim}}\sim C$, then $A \stackrel{\text{sim}}\sim C$ (transitive)

Proving these amounts to writing down an appropriate invertible matrix $P$ that realizes each similarity.

$\blacktriangleright$ Proof. (1) Note that $A=IAI^{-1}$, where $I$ is the identity matrix with $I^{-1}=I$. So we see that $A \stackrel{\text{sim}}\sim A$.

(2) Suppose $A \stackrel{\text{sim}}\sim B$; this means $A=PBP^{-1}$ for some invertible matrix $P$. Solving for $B$ we get $B= P^{-1}AP = P^{-1}A(P^{-1})^{-1}$. Since $P^{-1}$ is invertible, we see that $B \stackrel{\text{sim}}\sim A$.

(3) Suppose $A \stackrel{\text{sim}}\sim B$ and $B \stackrel{\text{sim}}\sim C$. Then we have invertible matrices $P,Q$ such that $$ A = PBP^{-1} \quad\text{and}\quad B=QCQ^{-1}. $$Substitution yields $$ A=PQCQ^{-1}P^{-1}=(PQ)C(PQ)^{-1}. $$And since $PQ$ is invertible, we have $A \stackrel{\text{sim}}\sim C$. $\blacklozenge$

Since $\stackrel{\text{sim}}\sim$ is an equivalence relation on $n\times n$ matrices, we can partition the set of all $n\times n$ matrices into **similarity classes**, where each class contains $n\times n$ matrices that are all similar to each other.
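For instance, the matrices $A=\begin{pmatrix}1 & 2\\1&1\end{pmatrix}$ and $B= \begin{pmatrix}1&1\\1&2\end{pmatrix}$ from the example above sit in two different similarity classes. Here is a small SymPy sketch (illustrative only, assuming SymPy is available) that re-checks this: solving $AP=PB$ returns only the zero matrix, so no invertible $P$ can exist.

```python
# Illustrative sketch: AP = PB forces P = 0, so A and B are not similar.
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[1, 2], [1, 1]])
B = sp.Matrix([[1, 1], [1, 2]])
P = sp.Matrix([[a, b], [c, d]])

print(sp.solve(list(A*P - P*B), [a, b, c, d], dict=True))
# expect only the trivial solution a = b = c = d = 0
```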
In general, equivalence relations allow us to classify matrices into equivalence classes.

**Remark.** An interesting question now arises: is there a "best representation" for each similarity class of $n\times n$ matrices? That is, is there a **canonical form** for each similarity class? (Compare with **row equivalence**: in every row-equivalence class we have a unique reduced row echelon form, RREF, that best represents said row-equivalence class.) The answer to what a canonical form would be for each similarity class is **nuanced**, and in fact depends on whether the base field is the reals or the complexes (or, in general, on which scalars we are allowed to use). To give you a taste of what these canonical forms look like, I will claim, but not prove, the following amazing results.

> **Real canonical form.**
> If $A$ is an $n\times n$ real square matrix, then $A$ is similar to a real canonical form that is **block diagonal** (blocks of matrices on the diagonal) and looks like $$ \begin{bmatrix}B_{1} & & & & & 0\\ & B_{2} \\ & & B_{3} \\ & & & \ddots \\ 0 & & & & & B_{k}\end{bmatrix} $$where each $B_{i}$ is a **square matrix**, and each $B_{i}$ is either a Jordan block of the form $$ \begin{bmatrix}\lambda & 1 & & & & 0\\ &\lambda & 1 \\ & & \lambda & 1 \\ & & & \ddots & \ddots \\ & & & & \lambda & 1 \\ 0 & & & & & \lambda\end{bmatrix} $$for some real number $\lambda$, or $B_{i}$ is a complex Jordan block of the form $$ \begin{bmatrix}R & I & & & & 0\\ &R & I \\ & & R & I \\ & & & \ddots & \ddots \\ & & & & R & I \\ 0 & & & & & R\end{bmatrix} $$where $R = \begin{bmatrix}a & -b\\b&a\end{bmatrix}$ and $I=\begin{bmatrix}1&0\\0&1\end{bmatrix}$ is the $2\times 2$ identity matrix, for some real numbers $a,b$.
> The different subblocks $B_{i}$ may have different sizes and different choices of $\lambda$ or $a,b$. But within each block $B_{i}$, the values of $\lambda$ or $a,b$ are fixed.

If we allow complex numbers, the result is cleaner.

> **Jordan canonical form.**
> If $A$ is an $n\times n$ real or complex matrix, then $A$ is (complex) similar to a Jordan canonical form, which is **block diagonal** (blocks of matrices on the diagonal) and looks like $$ \begin{bmatrix}B_{1} & & & & & 0\\ & B_{2} \\ & & B_{3} \\ & & & \ddots \\ 0 & & & & & B_{k}\end{bmatrix} $$where each $B_{i}$ is a **square matrix**, and each $B_{i}$ is a Jordan block of the form $$ \begin{bmatrix}\lambda & 1 & & & & 0\\ &\lambda & 1 \\ & & \lambda & 1 \\ & & & \ddots & \ddots \\ & & & & \lambda & 1 \\ 0 & & & & & \lambda\end{bmatrix} $$for some complex number $\lambda$.

I hope one day you take a second course in linear algebra that discusses this. What we will focus on, however, are **similarity classes of matrices that are similar to a diagonal matrix**, one of the form $$ \begin{bmatrix}\lambda_{1} & \\ & \lambda_{2} & \\ & & \lambda_{3} \\ & & & \ddots \\ & & & & \lambda_{n}\end{bmatrix} $$for some scalars $\lambda_{i}$ (possibly complex, depending on whether we allow complex numbers). Not every square matrix is similar to a diagonal matrix, but we would like to know when one is. Such matrices are called **diagonalizable**, and we will return to this topic next.
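As a tiny preview (illustrative only, assuming SymPy is available), SymPy's `diagonalize()` can exhibit such a similarity explicitly. Applied to the matrix $B$ from our first example, it produces a diagonal matrix with entries $3$ and $-1$, which is exactly the matrix $A$ up to the ordering of the diagonal entries.

```python
# Illustrative sketch: B = [[1,2],[2,1]] is diagonalizable.
# diagonalize() returns (P, D) with B = P D P^{-1} and D diagonal.
import sympy as sp

B = sp.Matrix([[1, 2], [2, 1]])
P, D = B.diagonalize()
sp.pprint(D)                          # diagonal, with entries -1 and 3 in some order
print(sp.simplify(P*D*P.inv() - B))   # zero matrix, confirming the similarity
```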
## Properties of similar matrices.

Let us assume all matrices in this section are square and of the same size. Let us record some properties of similarity, which allow us to quickly tell when two matrices are **not** similar to each other. These are necessary but not sufficient conditions.

> **Some properties of similarity.**
> Let $A$ and $B$ be $n\times n$ matrices.
> (1) If $A \stackrel{\text{sim}}\sim B$ and $A$ is invertible, then $B$ is also invertible.
> (2) If $A \stackrel{\text{sim}}\sim B$, then $\det(A) = \det(B)$.
> (3) If $A \stackrel{\text{sim}}\sim B$, then $\operatorname{tr}(A)= \operatorname{tr}(B)$, where $\operatorname{tr}(M)$ is the trace of the matrix $M$, obtained by adding up the diagonal entries of $M$.

Properties (1) and (2) are straightforward, while for (3) we need to deduce a property of trace:

> **A cyclic property of trace.**
> For any two square $n\times n$ matrices $A,B$, we have $$ \operatorname{tr} (AB)=\operatorname{tr}(BA), $$and as a consequence, $$ \operatorname{tr} ((A_{1}A_{2}\cdots A_{k})(B_{1}B_{2}\cdots B_{\ell})) = \operatorname{tr} ((B_{1}B_{2}\cdots B_{\ell})(A_{1}A_{2}\cdots A_{k})) $$

**Warning.** This does not mean we can permute the matrices into any order! That is, we do not generally have $\operatorname{tr}(ABCDE)=\operatorname{tr}(BAEDC)$! Think about why this reordering is not derivable from the property above. Trace preserves only cyclic permutations! However, we do have $\det(ABCDE)=\det(BAEDC)$, since by the product property of determinants both equal the scalar $\det(A)\det(B)\det(C)\det(D)\det(E)$.

Ok, let us assume this property of trace first, and prove our properties of similarity.

$\blacktriangleright$ Proof of similarity properties. (1) If $A \stackrel{\text{sim}}\sim B$, then $A=PBP^{-1}$ for some invertible $P$. Solving for $B$ we get $B=P^{-1}AP$, which is a product of three invertible matrices, so $B$ is also invertible. One can also see this by determinants: $$ \det(B)=\det(P^{-1}AP)=\det(P^{-1})\det(A)\det(P) $$where each of the terms on the right-hand side is nonzero.

(2) If $A\stackrel{\text{sim}}\sim B$, then $A=PBP^{-1}$ for some invertible $P$. Then $$ \det(A) = \det(PBP^{-1}) = \det(P)\det(B)\det(P^{-1}) $$and recall that $\det(P^{-1})=\frac{1}{\det (P)}$ (because $1=\det(I)=\det(PP^{-1})=\det(P)\det(P^{-1})$), so we have $$ \det(A) = \det(B). $$

(3) If $A\stackrel{\text{sim}}\sim B$, then $A=PBP^{-1}$ for some invertible $P$. Then $$ \operatorname{tr}(A)= \operatorname{tr}(PBP^{-1}) = \operatorname{tr}(BP^{-1}P)=\operatorname{tr}(B) $$by the cyclic property of trace. $\blacklozenge$

**Warning.** The converse of each of these three properties is **false**. Just because two matrices are both invertible does not mean they are similar to each other; just because two matrices have the same determinant does not mean they are similar to each other; and just because two matrices have the same trace does not mean they are similar to each other! Find counterexamples to each statement. Later we will discover more properties of similarity.
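Before proving the cyclic property, here is a quick NumPy sanity check (illustrative only, assuming NumPy is available) of the facts above: $\operatorname{tr}(AB)=\operatorname{tr}(BA)$ holds, a non-cyclic reordering can fail, and conjugating by an invertible $P$ preserves both determinant and trace.

```python
# Illustrative sketch: cyclic trace property, a non-cyclic failure,
# and invariance of det/trace under similarity.
import numpy as np

A = np.array([[0, 1], [0, 0]])
B = np.array([[0, 0], [1, 0]])
C = np.array([[1, 0], [0, 2]])

print(np.trace(A @ B), np.trace(B @ A))          # 1 and 1: equal (cyclic)
print(np.trace(A @ B @ C), np.trace(B @ A @ C))  # 1 and 2: not equal (non-cyclic)

P = np.array([[1.0, 2.0], [1.0, 3.0]])           # det(P) = 1, so P is invertible
M = P @ C @ np.linalg.inv(P)                     # M is similar to C
print(np.linalg.det(C), np.linalg.det(M))        # both 2 (up to round-off)
print(np.trace(C), np.trace(M))                  # both 3 (up to round-off)
```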
## Properties of trace, and comparison with determinant.

We close this section by proving the cyclic trace property. We do it with the good old entrywise definition of the matrix product. First recall the definition of trace:

> If $M$ is an $n\times n$ square matrix whose $(i,j)$-th entry is $m_{ij}$, then $\operatorname{tr}(M)=\sum_{i=1}^{n}m_{ii}$.

$\blacktriangleright$ Proof of the cyclic property of trace, $\operatorname{tr}(AB)=\operatorname{tr}(BA)$. Suppose matrix $A$ has $(i,j)$-th entry denoted $a_{ij}$, and $B$ has $(i,j)$-th entry $b_{ij}$, both $n\times n$. Then $\operatorname{tr}(AB)$ adds up the $(i,i)$-th entries of $AB$, which gives $$ \begin{align*} \operatorname{tr}(AB) & = \sum_{i=1}^{n}(AB)_{ii} \\ & =\sum_{i=1}^{n} \sum_{t=1}^{n}a_{it}b_{ti} \\ & = \sum_{t=1}^{n} \sum_{i=1}^{n} a_{it}b_{ti} \\ & = \sum_{t=1}^{n} \sum_{i=1}^{n} b_{ti} a_{it} \\ & = \sum_{t=1}^{n}(BA)_{tt} \\ & = \operatorname{tr}(BA). \end{align*} $$ Isn't it nice! $\blacksquare$

Trace is in fact a cousin of the determinant. Assuming all matrices here are square of the same size $n\times n$, we have the following comparison.

- Determinant is multiplicative: $\det(AB)=\det(A)\det(B)$.
- Determinant is not additive: we do not expect $\det(A+B)$ to equal $\det(A)+\det(B)$.
- Determinant is unchanged under any permutation of the factors: $\det(A_{1}A_{2}A_{3}\cdots A_{n})=\det(A_{\sigma(1)}A_{\sigma(2)}A_{\sigma(3)}\cdots A_{\sigma (n)})$ for any permutation $\sigma \in \mathfrak S_{n}$.
- Determinant tells us invertibility: $\det A \neq 0$ if and only if $A$ is invertible.
- Determinant is invariant under transpose: $\det(A^{T})=\det(A)$.
- If $A\stackrel{\text{sim}}\sim B$, then $\det(A)=\det(B)$.

And,

- Trace is additive: $\operatorname{tr}(A+B)=\operatorname{tr}(A)+\operatorname{tr}(B)$.
- Trace is not multiplicative: we do not expect $\operatorname{tr}(AB)$ to equal $\operatorname{tr}(A)\operatorname{tr}(B)$.
- Trace is only unchanged under cyclic permutations of the factors: $\operatorname{tr}(A_{1}A_{2}\cdots A_{n})=\operatorname{tr}(A_{2}A_{3}\cdots A_{n}A_{1})$, but not general permutations.
- Trace does not tell us anything about invertibility.
- Trace is invariant under transpose: $\operatorname{tr}(A^{T})=\operatorname{tr}(A)$.
- If $A\stackrel{\text{sim}}\sim B$, then $\operatorname{tr}(A) = \operatorname{tr}(B)$.
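A short NumPy illustration (not part of the notes, assuming NumPy is available) of the first two contrasts in each list: determinant is multiplicative but not additive, while trace is additive but not multiplicative.

```python
# Illustrative sketch comparing determinant and trace behavior.
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # det(A) = 6, tr(A) = 5
B = np.array([[1.0, 0.0], [4.0, 5.0]])   # det(B) = 5, tr(B) = 6

print(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))  # 30 and 30: multiplicative
print(np.linalg.det(A + B), np.linalg.det(A) + np.linalg.det(B))  # 20 and 11: not additive
print(np.trace(A + B), np.trace(A) + np.trace(B))                 # 11 and 11: additive
print(np.trace(A @ B), np.trace(A) * np.trace(B))                 # 21 and 30: not multiplicative
```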
Perhaps the "first" notion of equivalence relation in human existence is the counting numbers. Ask yourself what is the idea of "three-ness", or "seven-ness". We are able to extract the essence of "three buffalos" and "three fingers" and "three berries" all to the same "three-ness", despite three buffalos are not the same as three fingers! They are now all represented by the canonical symbol "3". (Of course, if we do care about the "buffalo"-ness, then 3 buffalo is not equivalent to 3 fingers!)